Design and Analysis of a Dynamic Load Balancing Strategy for Large-Scale Distributed Association Rule Mining
Abstract
Association rule mining is one of the most important data mining techniques. Its algorithms search a large space, consider numerous alternatives, and scan the data repeatedly. Parallelism is a natural way to handle industrial-sized databases. Large-scale computing systems, such as Grid computing environments, have recently come to be regarded as promising platforms for data- and computation-intensive applications like data mining. However, improving performance and achieving scalability on these heterogeneous platforms requires new data partitioning approaches and workload balancing features. This paper proposes a dynamic load balancing strategy for parallel association rule mining algorithms in a Grid computing environment. The strategy is built upon a distributed model that incurs small communication overheads for load updates and for both data and work transfers. It also supports the heterogeneity of the system and it is fault
Similar resources
Association rule mining and load balancing strategy in grid systems
Parallel and distributed systems are one of the important solutions proposed to improve the performance of sequential association rule mining algorithms. However, the parallelization and distribution process is not trivial and still faces many problems of synchronization, communication, and workload balancing. Our study is limited to the workload balancing problem. In this paper, ...
A Hierarchical Dynamic Load Balancing Strategy for Distributed Data Mining
Extracting useful knowledge from data sets measuring in gigabytes and even terabytes is a challenging research area for the data mining community. Sequential approaches suffer from a performance problem due to the fact that they have to mine voluminous databases. Parallelism is introduced as an important solution that could improve the response time and the scalability of these approaches. Howe...
A Comparative Study of Association Rule Mining Algorithms on Grid and Cloud Platform
Association rule mining is a time-consuming process because it is both data-intensive and computation-intensive. To mine large volumes of data and to enhance the scalability and performance of existing sequential association rule mining algorithms, parallel and distributed algorithms have been developed. These traditional parallel and distributed algorithms are based on homogeneous ...
A Novel Data Partitioning Approach for Association Rule Mining on Grids
Mining association rules refers to extracting useful knowledge from large databases. Algorithms of this technique are both data- and computation-intensive, which makes grid platforms very attractive for them. However, to exploit these platforms, new data partitioning features are required in which the specificities of both the association rule mining technique and grids must be taken into consideration....
Performance Evaluation of the Distributed Association Rule Mining Algorithms
One of the best-known problems in data mining is association rule mining. It requires very large computation and I/O traffic capacity; therefore, several distributed and parallel association rule mining algorithms have been developed. Because the association rule mining problem is NP-complete, estimating the execution time of the algorithms can be very important, especially for load balancing or...
Publication date: 2011